nonconvex primal-dual splitting method
NESTT: A Nonconvex Primal-Dual Splitting Method for Distributed and Stochastic Optimization
We study a stochastic and distributed algorithm for nonconvex problems whose objective consists of a sum of $N$ nonconvex $L_i/N$-smooth functions, plus a nonsmooth regularizer. The proposed NonconvEx primal-dual SpliTTing (NESTT) algorithm splits the problem into $N$ subproblems, and utilizes an augmented Lagrangian based primal-dual scheme to solve it in a distributed and stochastic manner. With a special non-uniform sampling, a version of NESTT achieves an $\epsilon$-stationary solution using $\mathcal{O}((\sum_{i=1}^N\sqrt{L_i/N})^2/\epsilon)$ gradient evaluations, which can be up to $\mathcal{O}(N)$ times better than the (proximal) gradient descent methods. It also achieves a Q-linear convergence rate for nonconvex $\ell_1$ penalized quadratic problems with polyhedral constraints. Further, we reveal a fundamental connection between {\it primal-dual} based methods and a few {\it primal only} methods such as IAG/SAG/SAGA.
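The "up to $\mathcal{O}(N)$ times better" claim can be checked numerically. The sketch below is illustrative only and is not taken from the paper's code. It compares the factor $(\sum_{i=1}^N\sqrt{L_i/N})^2$ from the bound above against $\sum_{i=1}^N L_i$, which is assumed here to be the corresponding factor for full (proximal) gradient descent ($N$ gradient evaluations per iteration with step size governed by $\sum_i L_i/N$). The function names and the sampling rule $p_i \propto \sqrt{L_i/N}$ are assumptions made for this example. By the Cauchy-Schwarz inequality the ratio lies between 1 (all $L_i$ equal) and $N$ (one $L_i$ dominates).

```python
import math

def nestt_factor(L):
    """(sum_i sqrt(L_i / N))^2: the factor in the bound quoted in the abstract."""
    N = len(L)
    return sum(math.sqrt(Li / N) for Li in L) ** 2

def gd_factor(L):
    """sum_i L_i: ASSUMED factor for full (proximal) gradient descent."""
    return sum(L)

def sampling_probs(L):
    """Hypothetical non-uniform sampling rule p_i proportional to sqrt(L_i / N)."""
    N = len(L)
    weights = [math.sqrt(Li / N) for Li in L]
    total = sum(weights)
    return [w / total for w in weights]

def speedup(L):
    """Ratio of the assumed GD factor to the NESTT factor (at least 1, at most N)."""
    return gd_factor(L) / nestt_factor(L)

if __name__ == "__main__":
    N = 100
    uniform = [1.0] * N                     # all smoothness constants equal
    skewed = [1e6] + [1e-6] * (N - 1)       # one component dominates
    print("uniform speedup:", speedup(uniform))
    print("skewed speedup: ", speedup(skewed))
    print("largest sampling probability (skewed):", max(sampling_probs(skewed)))
```

With equal constants the two factors coincide, so non-uniform sampling buys nothing; the gain appears only when the $L_i$ are highly unbalanced.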
- North America > United States > Iowa > Story County > Ames (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Reviews: NESTT: A Nonconvex Primal-Dual Splitting Method for Distributed and Stochastic Optimization
The idea of addressing finite-sum problems through augmented Lagrangian and ADMM-type algorithms is straightforward but new. Convergence results for the new randomized variant on nonconvex problems are also novel, to the best of my knowledge. Despite the notable originality, I think a couple of places need to be further discussed or clarified.
1. Although existing work on nonconvex SVRG/SAGA algorithms [1,21] only provides analysis for smooth problems, I would presume similar results (replacing the gradient by the prox-gradient) can also be derived with slight modification, given that the original algorithms are designed to handle nonsmooth regularization and can be adapted to non-uniform sampling as well. So it would be good if the authors could add such a comparison to nonconvex SVRG/SAGA in Table 1 when applicable.
2. It appears to me that in the NESTT-G algorithm (without reducing to the single-variable form), at each iteration, setting the remaining (N-1) x variables to z would require O(Nd) computation and memory cost, which is not negligible.
NESTT: A Nonconvex Primal-Dual Splitting Method for Distributed and Stochastic Optimization
Hajinezhad, Davood, Hong, Mingyi, Zhao, Tuo, Wang, Zhaoran
We study a stochastic and distributed algorithm for nonconvex problems whose objective consists of a sum of $N$ nonconvex $L_i/N$-smooth functions, plus a nonsmooth regularizer. The proposed NonconvEx primal-dual SpliTTing (NESTT) algorithm splits the problem into $N$ subproblems, and utilizes an augmented Lagrangian based primal-dual scheme to solve it in a distributed and stochastic manner. With a special non-uniform sampling, a version of NESTT achieves an $\epsilon$-stationary solution using $\mathcal{O}((\sum_{i=1}^N\sqrt{L_i/N})^2/\epsilon)$ gradient evaluations, which can be up to $\mathcal{O}(N)$ times better than the (proximal) gradient descent methods. It also achieves a Q-linear convergence rate for nonconvex $\ell_1$ penalized quadratic problems with polyhedral constraints. Further, we reveal a fundamental connection between {\it primal-dual} based methods and a few {\it primal only} methods such as IAG/SAG/SAGA.